National Repository of Grey Literature

Analysis and visualization of the GPT-2 language model
Šipoš, Daniel ; Mareček, David (advisor) ; Rosa, Rudolf (referee)
Visualization of deep neural network models with Transformer architecture is generally a very demanding task, which is usually solved by visualizing attention blocks and monitoring which words these blocks focus on. However, Transformer models have many layers and there are multiple attention heads on each layer. Therefore, each head may attend to different linguistic features. In this work, we focus on developing an application that is designed to visualize the behaviour of GPT-2 language models more clearly. We propose four visualization methods that examine the dependencies of generated words on previous words in the text. We monitor these dependencies by removing one of the words in the previously generated text or replacing it with a similar word and then observing changes in the probability of the generated word. We show the results of our methods produced on the GPT-2 Medium model and formulate hypotheses with the aim to explain them.
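The word-removal idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `removal_effects` and `toy_prob` are hypothetical, and `toy_prob` is a hand-written stand-in for a real language model such as GPT-2 Medium, used only so the example runs without downloading any weights. Any function that maps a context and a target word to a probability could be plugged in instead.

```python
from typing import Callable, List, Sequence

# Hypothetical interface: returns P(target | context) for some language model.
ProbFn = Callable[[Sequence[str], str], float]


def removal_effects(context: Sequence[str], target: str, prob_fn: ProbFn) -> List[float]:
    """For each context word, return how much the target's probability
    drops when that single word is removed from the context."""
    base = prob_fn(context, target)
    effects = []
    for i in range(len(context)):
        # Context with the i-th word left out.
        reduced = list(context[:i]) + list(context[i + 1:])
        effects.append(base - prob_fn(reduced, target))
    return effects


def toy_prob(context: Sequence[str], target: str) -> float:
    """Toy stand-in for a language model (an assumption for illustration):
    the target becomes more likely when a known cue word is in the context."""
    cues = {"barks": {"dog"}, "meows": {"cat"}}
    hits = sum(1 for word in context if word in cues.get(target, set()))
    return min(0.9, 0.1 + 0.4 * hits)


if __name__ == "__main__":
    context = ["the", "dog", "loudly"]
    for word, effect in zip(context, removal_effects(context, "barks", toy_prob)):
        print(f"{word:>8}: {effect:+.2f}")
```

A large positive effect marks a word the prediction depends on; the replacement variant mentioned in the abstract would follow the same pattern, substituting a similar word at position `i` instead of deleting it.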
